Implement pure CLI AutoR workflow and publication packages by Zefan-Cai · Pull Request #1 · AutoX-AI-Labs/AutoR

Zefan-Cai · 2026-03-31T03:37:28Z

Summary

This PR turns the branch into a pure CLI-first AutoR workflow runner with stronger workflow state management, richer platform-alignment modules, and production-oriented stage 07/08 packaging.

The branch keeps main.py as the run entrypoint, src/manager.py as the 8-stage orchestrator with human approval gates, src/operator.py as the Claude Code executor, and src/utils.py as the run-layout/prompt/validation layer. Runs still live under runs/<run_id>/, with stage drafts written to stages/*.tmp.md before validation and promotion.
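
As a rough illustration of the draft-then-promote step described above, the sketch below validates a `*.tmp.md` draft and renames it on success. The function name `promote_draft`, the `required_headings` check, and the example file name are hypothetical; the actual validation rules live in src/utils.py and are not shown in this PR description.

```python
from pathlib import Path

DRAFT_SUFFIX = ".tmp.md"


def promote_draft(draft: Path, required_headings: list[str]) -> Path:
    """Validate a stage draft, then promote it by dropping the .tmp suffix.

    Hypothetical helper: the heading check stands in for whatever
    validation the real workflow performs.
    """
    if not draft.name.endswith(DRAFT_SUFFIX):
        raise ValueError(f"{draft.name} is not a stage draft")

    text = draft.read_text(encoding="utf-8")
    missing = [h for h in required_headings if h not in text]
    if missing:
        # Leave the draft in place so it can be inspected or retried.
        raise ValueError(f"draft {draft.name} is missing sections: {missing}")

    final = draft.with_name(draft.name[: -len(DRAFT_SUFFIX)] + ".md")
    draft.replace(final)  # rename, overwriting any earlier promoted file
    return final
```

A failed validation leaves the draft untouched, so a later attempt can overwrite or repair it without losing the previously promoted file.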

TODO Status

Cross-stage rollback and downstream invalidation
Status: Done
What landed:

--rollback-stage CLI support
downstream stages marked stale
rollback target marked pending/dirty
approved memory rebuilt from manifest after rollback
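
A minimal sketch of the rollback semantics listed above, assuming the manifest is a dict with a `stages` list. The field names (`number`, `status`, `approved`, `stale`, `dirty`, `summary`, `approved_memory`) are illustrative assumptions, not the actual schema in src/manifest.py.

```python
def rollback(manifest: dict, target: int) -> dict:
    """Roll a run back to `target`, invalidating every later stage.

    Field names are hypothetical; only the behaviour mirrors the
    bullets above.
    """
    numbers = [stage["number"] for stage in manifest["stages"]]
    if target not in numbers:
        raise ValueError(f"unknown stage: {target}")

    for stage in manifest["stages"]:
        if stage["number"] == target:
            # The rollback target must be rerun: pending and dirty.
            stage.update(status="pending", approved=False, dirty=True, stale=False)
        elif stage["number"] > target:
            # Downstream outputs were built on now-invalid inputs.
            stage.update(status="stale", approved=False, stale=True)

    # Rebuild approved memory from whatever is still approved.
    manifest["approved_memory"] = [
        stage["summary"] for stage in manifest["stages"] if stage["approved"]
    ]
    return manifest
```

Stages before the target keep their approval, so only the target and its downstream stages need to be rerun.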

Run manifest and stage status file
Status: Done
What landed:

run_manifest.json as the primary machine-readable state source
per-stage status, approval, stale/dirty flags, session id, attempt count, artifact pointers, handoff pointer, compressed summary
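
For orientation, a per-stage manifest entry carrying the fields listed above might look like the sketch below. The class name `StageEntry`, the attribute names, and the `new_manifest` helper are assumptions made for illustration; the real layout of run_manifest.json may differ.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Optional


@dataclass
class StageEntry:
    """Hypothetical per-stage record for run_manifest.json."""
    number: int
    status: str = "pending"
    approved: bool = False
    stale: bool = False
    dirty: bool = False
    session_id: Optional[str] = None
    attempt_count: int = 0
    artifacts: list[str] = field(default_factory=list)  # artifact pointers
    handoff_path: Optional[str] = None                  # handoff pointer
    summary: str = ""                                   # compressed summary


def new_manifest(run_id: str, stage_count: int = 8) -> dict:
    """Build a JSON-serializable manifest with one entry per stage."""
    return {
        "run_id": run_id,
        "stages": [asdict(StageEntry(number=n)) for n in range(1, stage_count + 1)],
    }


if __name__ == "__main__":
    print(json.dumps(new_manifest("demo"), indent=2))
```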

Operator session recovery and failure hardening
Status: Done
What landed:

per-session state files under operator_state/
per-attempt state files under operator_state/
broken sessions are no longer reused
resume failure falls back to a fresh session and records attempt metadata
missing stage draft fallback materialization retained and integrated
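
The resume-then-fallback behaviour can be sketched roughly as follows. `run_with_recovery`, the `resume` and `start_fresh` callables, and the attempt-record fields are hypothetical stand-ins; the sketch does not reproduce how src/operator.py actually invokes Claude Code.

```python
import uuid
from typing import Callable, Optional


def run_with_recovery(
    session_id: Optional[str],
    broken: set[str],
    resume: Callable[[str], str],
    start_fresh: Callable[[str], str],
) -> tuple[str, str, list[dict]]:
    """Try to resume a session; on failure, fall back to a fresh one.

    Returns (output, session id actually used, attempt metadata).
    """
    attempts: list[dict] = []

    # Never reuse a session that is already known to be broken.
    if session_id is not None and session_id not in broken:
        try:
            output = resume(session_id)
            attempts.append({"session_id": session_id, "mode": "resume", "ok": True})
            return output, session_id, attempts
        except RuntimeError as exc:
            broken.add(session_id)
            attempts.append(
                {"session_id": session_id, "mode": "resume", "ok": False, "error": str(exc)}
            )

    new_id = uuid.uuid4().hex
    output = start_fresh(new_id)
    attempts.append({"session_id": new_id, "mode": "fresh", "ok": True})
    return output, new_id, attempts
```

In a real run the `attempts` list would be written to the per-attempt state files under operator_state/ rather than only returned.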

Stage context compression and handoff
Status: Done
What landed:

handoff/<stage>.md summaries for approved stages
routed orchestration context, manifest context, and handoff context injected into prompts
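
One way the handoff summaries could be folded into a prompt is sketched below. The helper name `build_handoff_context`, the section heading format, and the stage names are assumptions; only the `handoff/<stage>.md` path comes from the description above.

```python
from pathlib import Path


def build_handoff_context(run_dir: Path, approved_stages: list[str]) -> str:
    """Concatenate handoff summaries for approved stages into prompt context.

    Stages without a handoff file are skipped rather than treated as errors.
    """
    sections = []
    for stage in approved_stages:
        path = run_dir / "handoff" / f"{stage}.md"
        if path.exists():
            body = path.read_text(encoding="utf-8").strip()
            sections.append(f"## Handoff: {stage}\n{body}")
    return "\n\n".join(sections)
```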

TODO item 5
Status: Not done
Note:

The original task text provided in the thread was truncated after item 4 and before item 7, so this item was not fully visible. Rather than guess, I left this partially specified requirement unimplemented.

TODO item 6
Status: Not done
Note:

Same reason as item 5: the original task text was truncated and the item was not fully visible.

Submission-grade paper package
Status: Done
What landed:

stronger stage 07 paper package generation
manuscript .tex, bibliography, tables, figure manifest, build script, submission checklist, and compiled PDF placeholder artifacts

Review / dissemination package
Status: Done
What landed:

stronger stage 08 release/review package generation
readiness checklist, threats-to-validity notes, artifact bundle manifest, release notes, external summary, and dissemination collateral generation hooks

Frontend run dashboard
Status: Not done by design
Note:

A frontend/dashboard iteration was explored earlier, but the final direction for this branch was explicitly changed to pure CLI. The web stack was removed accordingly.

Tests and CI
Status: Partially done
What landed:

expanded regression coverage around prompt context, KB search, rollback/stale handling, operator recovery, literature workflow, debate workflow, playbook workflow, router execution, foundry generation, and manifest consistency
What is still missing:
CI wiring in GitHub Actions or equivalent

Additional Notes

run_state.json file dependency has been removed; run_manifest.json is now the sole persisted workflow state source.
src/run_state.py remains only as an in-memory compatibility formatter/adapter derived from the manifest.
The branch is intentionally scoped to a pure CLI main workflow rather than a web control plane.

Validation

python -m py_compile main.py src/*.py src/platform/*.py tests/*.py
python -m unittest discover -s tests -v

black-yt · 2026-03-31T15:18:58Z

Thanks for the work here. There is useful progress in this branch, but this PR should be split before merge.

Right now it is too broad to review safely: 28 changed files, ~4k additions, and several distinct concerns bundled together. In particular, it mixes:

Core workflow state changes (run_manifest, rollback, stale/dirty stage tracking, CLI flags)
Operator/session recovery and stage handoff/compression logic
Stage 07/08 publication-package changes
A large new src/platform/* stack plus knowledge_base.py / inspection.py
README + test expansion

These are not one review unit. Some are core workflow changes, some are reliability improvements, some are writing-package features, and some are a substantial architectural expansion. Reviewing them together makes it hard to reason about regressions, approve only the good parts, or maintain a clear project direction.

Suggested split:

PR A: workflow-state layer only
- main.py, src/manager.py, src/manifest.py, src/run_state.py, src/utils.py
- focus on run_manifest, rollback, stale downstream invalidation, and state transitions
PR B: operator recovery / continuation only
- src/operator.py + the minimal related manager changes + targeted tests
- focus on session recovery, failed resume fallback, attempt metadata, handoff/compression if still needed
PR C: Stage 07/08 packaging only
- publication package, review/dissemination artifacts, README updates relevant to that scope, and tests for that slice
PR D: platform modules only, if they are still desired
- src/platform/*, src/knowledge_base.py, src/inspection.py
- this is a major architectural addition by itself and should be reviewed independently from the CLI workflow changes

Please keep each split PR narrowly scoped, with its own motivation, tests, and validation. In the current form, this is too much surface area for one merge.

tangxiangru · 2026-04-07T05:35:10Z

@yyifan-onyan

yyifan-Onyen · 2026-04-07T23:57:23Z

Code Review

I agree with @black-yt's suggestion to split this PR — 28 files and ~4k additions across multiple unrelated concerns is too much surface area for a single review.

Beyond the split, there's a larger issue: most of the core functionality in this PR has already landed on main through other PRs.

Already on main

| Feature | In this PR | On main now |
| --- | --- | --- |
| `run_manifest.json` state management | Yes | Yes (`src/manifest.py`, 410 lines) |
| `--rollback-stage` + downstream invalidation | Yes | Yes |
| Operator session recovery / attempt state | Yes | Yes (merged via PR #12) |
| Stage handoff / compression | Yes | Yes |

These portions would conflict heavily on rebase and would essentially be duplicated work.

What's actually new

--show-status and --kb-search CLI commands — useful, worth a focused PR
src/knowledge_base.py and src/inspection.py — potentially useful but need their own review
src/platform/* (14 files) — see below

Concerns about `src/platform/*`

Many of the 14 platform modules are architectural stubs rather than functional code:

sandbox.py (39 lines): SandboxRunner just calls subprocess.run directly — no actual sandboxing
security.py (55 lines): RBAC role definitions, but AutoR is a single-user CLI tool
messaging.py (31 lines), protocols.py (53 lines): interface definitions with no implementation
semantic.py (52 lines): token-overlap ranking presented as "semantic search" — not embedding-based
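
To make the last distinction concrete, a token-overlap ranker can be as small as the sketch below. This is an illustrative stand-in, not the code in semantic.py; `tokenize` and `rank_by_overlap` are hypothetical names, and Jaccard similarity is just one possible overlap score.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into a set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def rank_by_overlap(query: str, docs: list[str]) -> list[tuple[float, str]]:
    """Rank documents by Jaccard overlap between query and document tokens."""
    q = tokenize(query)
    scored = []
    for doc in docs:
        d = tokenize(doc)
        union = q | d
        score = len(q & d) / len(union) if union else 0.0
        scored.append((score, doc))
    # Highest overlap first; ties keep their input order (sort is stable).
    return sorted(scored, key=lambda pair: pair[0], reverse=True)
```

Because the score depends only on shared surface tokens, a query for "automobile" scores zero against a document that only says "car", which is the gap an embedding-based search would be expected to close.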

The modules with real substance (router.py 269 lines, literature.py 271 lines, debate.py 157 lines) are imported by manager.py but not actually used in the core _run_stage loop — the router is instantiated but the stage execution still goes through the existing ClaudeOperator path.

Merge conflicts

8 files currently conflict with main: README.md, main.py, src/manager.py, src/manifest.py, src/operator.py, src/platform/foundry.py, src/utils.py, tests/test_operator_recovery.py.

Suggested path forward

Drop the already-merged portions (manifest, rollback, operator recovery, handoff) — they're on main already
PR A: --show-status + --kb-search CLI commands with knowledge_base.py — small, reviewable, useful
PR B: Platform modules that have real functionality (router, literature, debate) — but only if they're wired into the actual stage execution, not just imported
Hold off on stub modules (sandbox, security, messaging, protocols, semantic) until there's a concrete use case driving them

black-yt · 2026-04-08T04:07:57Z

Thank you for your contribution. This PR has been superseded by your newer PR, so we are closing it.

Zefan-Cai added 3 commits March 29, 2026 17:37

Add CLI research workflow platform foundations

9472e66

Implement CLI manifest workflow and publication packages

1ceea7d

Merge origin/main into zefan-dev

de2d9b3

black-yt closed this Apr 8, 2026

Zefan-Cai wants to merge 3 commits into main from zefan-dev
